Corpus-based synthesis of fundamental frequency contours with various speaking styles from text using F0 contour generation process model

Authors

  • Keikichi Hirose
  • Kentaro Sato
  • Nobuaki Minematsu
Abstract

A corpus-based method of generating fundamental frequency (F0) contours of various speaking styles from text was developed. Instead of directly predicting F0 values, the method predicts the command values of the F0 contour generation process model. Because of the model constraint, the resulting F0 contour retains a certain degree of naturalness even when the prediction is incorrect. The method includes a scheme for automatic extraction of the model commands, which is necessary to prepare the training corpora for the various speaking styles. Introducing constraints on phrase command locations improved the extraction, which in turn improved the performance of the method. Speech synthesis was conducted using an HMM-based speech synthesizer for calm speech and three types of emotional speech. A perceptual experiment showed that the designated emotions were conveyed well by the F0 contours generated with the developed method.
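
To make the idea of predicting "command values" more concrete, the sketch below shows how a small set of phrase and accent commands can be turned into an F0 contour. It assumes the standard formulation of the generation process model (often called the Fujisaki model), in which log F0 is a baseline plus phrase components and accent components. The abstract itself gives no equations, so the formulas, the constants `alpha`, `beta`, and `gamma`, the function names, and the example command values are all assumptions chosen for illustration, not values from the paper.

```python
import math


def phrase_response(t, alpha=3.0):
    """Impulse response of the phrase control mechanism (assumed alpha)."""
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)


def accent_response(t, beta=20.0, gamma=0.9):
    """Step response of the accent control mechanism (assumed beta, gamma)."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def f0_contour(times, fb, phrase_cmds, accent_cmds):
    """Generate F0 (Hz) at each time in `times`.

    fb          : baseline frequency in Hz
    phrase_cmds : list of (onset_time, magnitude) impulses
    accent_cmds : list of (onset_time, offset_time, amplitude) steps
    """
    contour = []
    for t in times:
        log_f0 = math.log(fb)
        for t0, ap in phrase_cmds:
            log_f0 += ap * phrase_response(t - t0)
        for t1, t2, aa in accent_cmds:
            log_f0 += aa * (accent_response(t - t1) - accent_response(t - t2))
        contour.append(math.exp(log_f0))
    return contour


# Hypothetical commands: one phrase command and one accent command.
times = [i * 0.01 for i in range(200)]  # 2 seconds at 10 ms steps
contour = f0_contour(
    times,
    fb=120.0,
    phrase_cmds=[(0.0, 0.5)],
    accent_cmds=[(0.3, 0.8, 0.4)],
)
```

Because the contour is always a sum of smooth responses on top of a positive baseline, a mispredicted command shifts or rescales a component but cannot produce the frame-to-frame jumps that direct F0 prediction can, which is one way to read the abstract's remark about the model constraint.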

Similar resources

Corpus-based synthesis of fundamental frequency contours based on a generation process model

A model-constrained corpus-based synthesis strategy was developed for fundamental frequency (F0) contours of Japanese sentences. In the training phase, the relationship between linguistic factors and the command values (amplitudes and locations) of the F0 contour generation process model was learned by a prediction module, which is a neural network in the current paper. Input parameters consist of linguisti...

Corpus-based Generation of Fundamental Fr Process Model and Considerin

We previously conducted emotional speech synthesis using our corpus-based method of generating fundamental frequency (F0) contours from text. The method predicts the command values of the F0 contour generation process model instead of directly predicting the F0 value of each time frame. Better control of F0 contours was realized by taking the emotional level of each bunsetsu into account: adding informatio...

Corpus-based generation of prosodic features from text based on generation process model

A complete scheme for generating prosodic features from text input was constructed. The method consists of corpus-based prediction of pauses, phone durations, and fundamental frequencies (F0's), in this order, and information predicted in an earlier process is used in the following processes. Since prediction of F0's is done on the command values of the F0 contour generation process model instead ...

Realization of Prosodic Focuses in Corpus-based Generation of Fundamental Frequency Contours of Japanese Based on the Generation Process Model

A method was developed for generating sentence F0 contours of Japanese when a focus is placed on one of the “bunsetsu” of an utterance. It controls F0 through the F0 model rather than by frame-by-frame F0 prediction, as is done in HMM-based speech synthesis. The method first predicts differences in the F0 model commands between utterances with and without focus, and then applies them to the F0 model ...

Publication date: 2004